(54) Information processing apparatus and input method

(57) More comfortable data input is implemented by
using speech recognition and a character prediction
function in combination. For example, according to a data
input method of this invention, character string candidates
which follow a character string input by a character
string input device are predicted (S402), and the
predicted character string candidates are displayed on
a display device (S403). Speech recognition is performed
for speech input by the speech input device using
the character string candidates displayed on the display
device as words to be recognized (S411), and a
character string serving as the recognition result is
confirmed as a character string to be used (S412).
Description

FIELD OF THE INVENTION 

[0001] The present invention relates to a data input
technique which uses key/button operation and a 
speech recognition function in combination. 



BACKGROUND OF THE INVENTION 
[0002] Data input is required to use many current information
[...] use in an information device is probably a character.
Character input generally [...] such as keyboard operation
or the like. In particular, character input in a compact
portable terminal such as a cellular phone, PDA, or the like,
which has a limited number of keys and buttons, requires a
larger number of times of input operations involving
key/button presses than that in a personal computer or the
like with many [...]

[0004] To increase the efficiency in such troublesome
character string input, there is implemented a character
input method which has an input prediction function (also
referred to as an AutoComplete function). This function
predicts and presents a character string candidate
which follows an input character string.

[0005] With the input prediction function, character
string input can be completed by selecting a desired one
(if any) from presented character string candidates. If
the prediction performance is high, every character
string can be input without inputting the entire character
string. By presenting character strings having undergone
kana-kanji conversion as candidates for each
character string, kana-kanji conversion operation can
also be omitted.

[0006] There are proposed many techniques that pertain
to character string input prediction for supporting
character input (e.g., see Japanese Patent Laid-Open
Nos. 08-235318 and 08-255158 and "POBox (Predictive
Operation Based On eXample)", URL:
http://www.csl.sony.co.jp/person/masui/OpenPOBox/index.html).
[0007] Techniques for supporting character string input
alternative to the above-mentioned input prediction
include speech recognition. Since the use of speech
recognition basically eliminates the need for key operation
to input a character string, those unskilled in key
operation can efficiently input a character string. Also,
speech recognition is effective in inputting a character
string in a device with a limited number of keys.
[0008] An input prediction technique is also implemented
in a compact portable terminal such as a cellular
phone, PDA, or the like which is recently becoming more
sophisticated, and is very convenient. However, when
a plurality of character strings are presented as candidates,
an operation of selecting a desired one may become
complicated. Particularly, to select a character
string only by cursor movement operation and scroll operation,
an operation of moving the position of a cursor
needs to be repeated until the cursor reaches the desired
character string. When many candidates are presented,
the number of times of operations increases.
[0009] Speech recognition techniques have recently
improved in performance. A dictation software program
which handles several tens of thousands of words allows
comfortable character string input on a high-performance
computer in a relatively quiet environment such as an
office. However, since speech recognition which handles
several tens of thousands of words requires many computer
resources (CPU and memory), comfortable operation
cannot be expected even in an existing compact
portable terminal whose performance has been enhanced.
Additionally, an existing dictation software program
does not offer satisfactory recognition performance
in a place where background noise is loud and thus
cannot offer its real performance outdoors, where a compact
portable terminal is often used.

[0010] In consideration of the use environment and
resources of a PDA, it is the best way to minimize the
number of recognizable words in order to achieve a
response speed which does not apply any stress to the
user. However, a mere reduction in the number of words
lowers the recognition rate and disables inputting of a
desired character string without correcting operation. It
is rather difficult to comfortably input a character string
in a compact portable terminal using only speech recognition
by a currently used technique.

[0011] As another problem, homophones cannot be
distinguished from each other using only speech. More
specifically, whether either of "son" and "sun" (both of
which have the same pronunciation) can be adopted as
the notation for speech input /sAn/ cannot be determined
from the speech input.



SUMMARY OF THE INVENTION 

[0012] The present invention has as an aim implementing
comfortable data input using a character string
prediction function and speech recognition in combination.

[0013] An information processing apparatus according
to one embodiment of the present invention addresses
the above-mentioned problems by having the following
arrangement. That is, the information processing apparatus
comprises prediction means for predicting at
least one character candidate which follows at least one
input character, display control means for controlling
displaying the at least one character candidate predicted
by the prediction means, speech recognition means
for performing speech recognition for input speech using
the at least one displayed character candidate as a
word to be recognized, and confirmation means for
confirming the recognition result from the speech
recognition means.

[0014] According to another embodiment of the
present invention, there is provided a data input method
in an information apparatus, comprising a prediction 
step of predicting at least one character candidate which 
follows at least one character input by a character input 
device, a display control step of displaying the at least 
one character candidate predicted in the prediction step 
on a display device, a speech recognition step of per- 
forming speech recognition for speech input by a 
speech input device using the at least one character 
candidate displayed on the display device as a word to 
be recognized, and a confirmation processing step of 
confirming, as at least one character to be used, at least 
one character serving as a recognition result obtained 
in the speech recognition step. 
[0015] Other and further objects, features and advan- 
tages of the present invention will be apparent from the 
following descriptions taken in conjunction with the ac- 
companying drawings, in which like reference charac- 
ters designate the same or similar parts throughout the 
figures thereof. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0016] The accompanying drawings, which are incor- 
porated in and constitute a part of the specification, il- 
lustrate embodiments of the invention and, together with 
the descriptions, serve to explain the principle of the in- 
vention. 



Fig. 1 is a block diagram showing the arrangement 
of a cellular phone set according to an embodiment; 
Fig. 2 is a block diagram showing the functional ar-
rangement of a process that pertains to character
input in the cellular phone set according to the em- 
bodiment; 

Fig. 3 is a view showing an example of the layout 
of buttons of an input device according to the em- 
bodiment; 

Fig. 4 is a flowchart showing the flow of a character 
string input process according to the embodiment; 
Fig. 5 is a chart showing the transition of the display
contents of a display device during the character in- 
put process; 

Fig. 6 is a flowchart showing the flow of a process 
performed to confirm a character string after check- 
ing a recognition result; 

Fig. 7 is a chart for explaining a process performed 
when speech recognition in character string selec- 
tion causes a recognition error; 
Fig. 8 is a flowchart showing the flow of a process 
of presenting character string candidates according 
to the third embodiment; and 
Fig. 9 is a chart for explaining an example of pres- 
entation of character string candidates according to 
the third embodiment. 



DETAILED DESCRIPTION OF THE PREFERRED 
EMBODIMENTS 

[0017] Preferred embodiments of the present invention
will now be described in detail in accordance with
the accompanying drawings.

(First Embodiment) 

[0018] An example will be described below wherein a
data input apparatus according to the present invention
is applied to a cellular phone set. The present invention,
however, is not limited to a cellular phone set and can
be applied to any information processing apparatus that
requires character input from the user, including a portable
terminal such as a PDA, personal computer, and
the like.

[0019] Fig. 1 is a block diagram showing the arrangement
of a cellular phone set according to the first embodiment.

[0020] Referring to Fig. 1, reference numeral 101 denotes
a control memory (ROM); 102, a central processing
unit (CPU); 103, a memory (RAM); 104, an external
storage device; 105, an input device comprising a key,
button, and the like; 106, a display device such as a liquid
crystal monitor; 107, a speech input device (microphone);
108, a speech output device (speaker); and
109, a bus. As shown in Fig. 1, the external storage device
104 stores a control program 110 for implementing
the cellular phone set according to this embodiment,
character string prediction data 209 for character string
prediction, speech recognition data 210 including
acoustic models required to perform speech recognition,
and the like. The character string prediction data
209 is formed using a kana-kanji conversion dictionary,
the character input history of the user, and the like. The
control program 110 and data 209 and 210 are loaded
in the RAM 103 through the bus 109 under the control
of the CPU 102 and are executed by the CPU 102. They
may be stored in the ROM 101 instead of the external
storage device 104.

[0021] Fig. 2 is a block diagram showing the functional
arrangement of a process that pertains to character input
in the cellular phone set according to the embodiment.

[0022] An operation input unit 201 detects operation 
with a button or the like, including character input, which 
is performed by the user using the input device 105. 

[0023] A character string candidate prediction unit
202 predicts character string candidates which follow a
character string input by the user while referring to the
character string prediction data 209.
[0024] A presentation method determination unit 203
determines a method of presenting the predicted character
string candidates.

[0025] A candidate classification unit 204 classifies
the predicted character string candidates into a plurality
of groups in accordance with the determined presentation
method.

[0026] A character string candidate presentation unit 
205 displays character string candidates on the display 
device 106 in accordance with the determined presen- 
tation method. 

[0027] An alternative candidate notification unit 206 
notifies the user that there are more candidates other 
than the presented ones when candidates are classified 
into a plurality of groups and are presented by group. 
[0028] A candidate switching unit 207 detects the us- 
er's operation of switching character string candidates 
to be presented from one group to another and switches 
the candidates to be presented when candidates are 
classified into a plurality of groups and are displayed for 
each group. 

[0029] A speech recognition unit 208 performs 
speech recognition which handles character string can- 
didates presented by the character string candidate 
presentation unit 205 as words to be recognized. The 
pronunciation of each word to be recognized is deter- 
mined with reference to the character string prediction 
data 209. 

[0030] A speech synthesizer 211 generates a synthetic
sound to present data or give various kinds of notifications
to the user by voice.

[0031] A character string selection method presentation
unit 212 presents to the user a method of selecting a
desired one from presented character string candidates.
[0032] A selection method determination unit 213 detects
the user's operation of selecting the character
string selection method and determines the character
string selection method.

[0033] A controller 200 controls the above-mentioned 
modules and controls the entire process that pertains to 
character input. 

[0034] Fig. 3 is a view showing an example of the lay- 
out of buttons of the input device 105. 
[0035] Reference numerals 301 and 302 denote concentrically
arranged buttons. The button 301 serving as
the outer ring is mainly used to designate the moving
direction of a cursor (up, down, left, and right). The button
301 will be denoted by symbols "↑", "↓", "←", and
"→" hereinafter. The inner ring central button 302 is
mainly used to confirm a selected candidate in character
string selection. The button 302 will be denoted by a
symbol "•" hereinafter. Reference numerals 303 to 306
denote buttons. The function of each button changes in
accordance with the state transition of the process in
character string processing. The buttons 303 to 306 will
be denoted by symbols ">", "★", "*", and "#", respectively.

[0036] A character string input process according to 
the embodiment will be described with reference to Figs. 
4 and 5. Fig. 4 is a flowchart showing the flow of the 
character string input process according to the embod- 
iment; and Fig. 5, a chart showing the transition of the 
display contents of the display device 106 during the 
character string input process. Since known techniques
can be used to perform character string candidate prediction
and speech recognition, a detailed description
thereof will be omitted.

[0037] A case will be described wherein the user inputs
a character string "Thank you so much." Assume
that the user has already input a character string "Thank
you " and is about to input a subsequent character string
"so".

[0038] After the character string "Thank you " is input,
the display contents of the display device 106 are as
denoted by reference numeral 510 of Fig. 5.
[0039] The user inputs the first character "s" to input
the character string "so" (step S401). When the operation
input unit 201 detects that the character "s" is input,
the character string candidate prediction unit 202 refers
to the character string prediction data 209 and predicts
character string candidates which follow the character
"s" (step S402). As described above, the character
string prediction data 209 is formed using the character
input history of the user, a kana-kanji conversion dictionary
which indicates the correspondence between hiragana
characters and kanji characters, and the like.
Since a plurality of characters are generally assigned to
one button in a cellular phone, character strings beginning
with a character "p", "q", "r", or "s" may be predicted
as character string candidates when a button "PQRS"
is pressed once.
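
The prediction for a button that carries several letters can be sketched as follows. This is only a minimal illustration under stated assumptions, not the prediction method of the embodiment: the `KEYPAD` table, the function `predict_candidates`, and the sample vocabulary are hypothetical names introduced here, and a plain word list stands in for the character string prediction data 209.

```python
from typing import Dict, List

# Hypothetical keypad layout. Only the "PQRS" button is named in the text;
# the digit labels and the other entries are assumptions for illustration.
KEYPAD: Dict[str, str] = {"2": "abc", "3": "def", "7": "pqrs"}


def predict_candidates(button: str, vocabulary: List[str]) -> List[str]:
    """Return the words whose first letter is assigned to the pressed button."""
    letters = KEYPAD.get(button, "")
    return [word for word in vocabulary if word and word[0].lower() in letters]


# A plain word list standing in for the prediction data (assumption).
vocabulary = ["safe", "save", "say", "see", "so", "show", "step", "please", "thank"]
candidates = predict_candidates("7", vocabulary)
```

With this sample vocabulary, one press of the hypothetical "7" (PQRS) button keeps "please" in addition to the strings that begin with "s", which is the ambiguity the paragraph above describes.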

[0040] Predicted character string candidates are presented
on the display device 106 by the character string
candidate presentation unit 205 (step S403). At this
time, the presentation method determination unit 203
may define a character string presentation order. For example,
if the character input history of the user is used
as the character string prediction data 209, character
strings may be displayed in order of decreasing frequency
or in reverse chronological order (a character string
input latest first). If the number of the predicted character
string candidates is large, the presentation order may
be determined using any of the above-mentioned criteria,
and only a predetermined number of character string
candidates may be displayed. As another method, the
number of character string candidates which can be displayed
at a time may be calculated from the size of a
screen area for character string candidate presentation,
and only the calculated number of character string candidates
may be displayed. In step S403, the character
string selection method presentation unit 212 may
present a character string selection method simultaneously
with the presentation of the character string
candidates.
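
The two presentation orders mentioned above (decreasing frequency and reverse chronological order), together with the limit on the number of displayed candidates, can be sketched as follows. The function `order_candidates`, its parameters, and the sample history are hypothetical; the tie-breaking rule for equal frequencies is an assumption, since the text does not specify one.

```python
from collections import Counter
from typing import List


def order_candidates(history: List[str], prefix: str, limit: int,
                     by: str = "frequency") -> List[str]:
    """Rank past inputs that start with `prefix` and keep at most `limit`.

    `history` is assumed to be in chronological order, oldest first.
    """
    matching = [word for word in history if word.startswith(prefix)]
    # Position of the most recent occurrence of each string (larger = newer).
    last_seen = {word: index for index, word in enumerate(matching)}
    if by == "frequency":
        counts = Counter(matching)
        # Most frequent first; ties go to the more recently used string
        # (an assumed rule, not taken from the text).
        ranked = sorted(counts, key=lambda w: (-counts[w], -last_seen[w]))
    elif by == "recency":
        # Reverse chronological order: the string input latest comes first.
        ranked = sorted(last_seen, key=lambda w: -last_seen[w])
    else:
        raise ValueError(f"unknown ordering: {by}")
    return ranked[:limit]


history = ["so", "see", "so", "say", "see", "so", "step"]
top_by_frequency = order_candidates(history, "s", limit=3)
top_by_recency = order_candidates(history, "s", limit=2, by="recency")
```

The `limit` argument plays the role of the predetermined number of candidates; it could equally be a value computed from the screen area.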

[0041] Assume that character strings "safe", "save",
"say", "see", "so", "show", and "step" are predicted as
character string candidates in step S402 in response to
the input of the character "s". This embodiment will also
describe a case wherein character string selection by
speech recognition and character string selection by
button operation are used in combination.
[0042] Reference numeral 520 in Fig. 5 shows how 
predicted character string candidates are presented. In
this example, a window is split in two, and a character 
string being input is displayed in a character input win- 
dow 521 while character string candidates are displayed 
in a character string selection window 522. An operation 
button guide 523 for designating the character string se- 
lection method by the character string selection method 
presentation unit 212 is displayed together with the dis-
play of the character string candidates. To indicate that
the current object to be operated is the window where 
character input is performed, the character input window 
is highlighted by, e.g., changing the background color. 
The presentation method determination unit 203 con- 
trols presentation of character string candidates and ad- 
ditional display of the operation button guide. 
[0043] If one desired by the user is not among the pre- 
sented character string candidates in step S404, the 
flow returns to step S401 to prompt the user to further 
input a character string. On the other hand, if the desired
one is among them, the flow advances to step S405. 
[0044] In step S405, the user shifts to an operation of
selecting the desired character string. If the user selects 
to use speech recognition, the flow advances to step 
S409. On the other hand, if the user selects to use button 
operation, the flow advances to step S406. 
[0045] A case will be described first wherein selection 
is performed by button operation. The user operates to 
select character string selection by button operation 
(step S406). The selection method determination unit 
213 detects the operation, and subsequent character 
string selection is performed by button operation. In this 
embodiment, button operation is selected by pressing 
the button "★" denoted by reference numeral 304 in 
Fig. 3. Reference numeral 530 in Fig. 5 shows the dis- 
play contents when button selection is selected. In the 
display contents 530, to indicate that the object to be 
operated shifts to the window where the character string 
candidates are displayed, an area to be highlighted 
shifts from the character input window to the character 
string selection window, and a cursor is displayed at the 
position of the first character string candidate "safe". At 
this time, the function of the button "★" is changed to 
"Back (the object to be operated shifts to the character 
input window)". 

[0046] The user selects the desired character string
by button operation (step S407). Referring to Fig. 5, to
select the target character string "so", the user presses
the direction portions of the button 301 and moves
the cursor position to the character string "so". Reference
numerals 540 and 550 denote the screen transition
during this operation.

[0047] In step S408, the user operates to confirm a
character string to be used. When the user presses the
button denoted by reference numeral 302 while the
display contents 550 are displayed, the character string
"so" being selected is confirmed as the character string
to be used. Reference numeral 560 denotes the screen
after the character string "so" is confirmed. The case has
been described wherein one is selected from predicted
character string candidates by button operation.
[0048] A case will be described next wherein the user
selects to use speech recognition in step S405. The user
operates to select character string selection by speech
recognition (step S409). The selection method determination
unit 213 detects the operation, and subsequent
character string selection is performed by speech
recognition.

[0049] In this embodiment, speech recognition is selected
by pressing the button ">" denoted by reference
numeral 303. When the user presses the button ">"
while the display contents 520 in Fig. 5 are displayed,
the area to be highlighted shifts from the character input
window to the character string selection window. Note that
since character string selection is not performed by cursor
movement in the case of speech recognition, no cursor
is displayed on the character string selection window.
The user utters a desired character string "so" (step
S410). The speech recognition unit 208 performs
speech recognition for the utterance of the user (step
S411) and confirms, as a character string to be used,
the resultant character string serving as the recognition
result (step S412). The speech recognition in step S411
handles only character strings presented by the character
string candidate presentation unit 205 as words to be
recognized. The speech recognition unit 208 determines
the pronunciation of each word to be recognized
with reference to the character string prediction data
209. The screen transition when a character string is selected
by speech recognition is represented by the transition
from display contents 570 in Fig. 5 to the display
contents 560.
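
The idea of restricting the words to be recognized to the displayed candidates can be sketched as follows. This is not a speech recognizer: as a stated simplification, the utterance is assumed to be already available as a pronunciation string, and a generic string similarity from the Python standard library stands in for acoustic scoring. The function `recognize` and the pronunciation strings are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict


def recognize(utterance: str, displayed: Dict[str, str]) -> str:
    """Return the displayed candidate whose pronunciation best matches.

    `displayed` maps each candidate on screen to a pronunciation string.
    Only these candidates are scored, so the search space stays small.
    """
    if not displayed:
        raise ValueError("no candidates are displayed")

    def score(candidate: str) -> float:
        # String similarity as a stand-in for an acoustic score (assumption).
        return SequenceMatcher(None, utterance, displayed[candidate]).ratio()

    return max(displayed, key=score)


# Hypothetical pronunciations; the text says they are determined with
# reference to the character string prediction data 209.
displayed = {
    "safe": "s ey f", "save": "s ey v", "say": "s ey", "see": "s iy",
    "so": "s ow", "show": "sh ow", "step": "s t eh p",
}
result = recognize("s ow", displayed)
```

Because the vocabulary is limited to the handful of strings on screen, the loop scores only those entries, which mirrors the reduced computational quantity the embodiment relies on.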

[0050] As described above, according to this embodiment,
character string candidates which follow a character
input using the input device 105 are displayed, and
a character string to be used can be selected from the
character string candidates by speech recognition. This
makes it possible to greatly reduce troublesome button
operation. Since the speech recognition in this embodiment
handles only displayed character string candidates
as words to be recognized, its computational
quantity becomes small. Therefore, even if such speech
recognition is implemented by, e.g., a compact portable
terminal, the portable terminal can operate at sufficiently
high speed while keeping high recognition rate.



(Second Embodiment)

[0051] In the first embodiment, a character string to
be used is confirmed in step S412 without the user
checking the result of speech recognition in step S411.
In this case, if utterance is misrecognized as a character
string different from a desired one, the wrong character
string is confirmed as the character string to be
used. To avoid this, a step of checking a recognition result
is necessary. Under the circumstances, this embodiment
will describe an example with reference to Fig. 6
wherein a character string to be used is confirmed after
checking a recognition result.
[0052] Fig. 6 is a flowchart showing the flow of a process
performed to confirm the character string to be used
after checking a recognition result. Fig. 6 shows only
processing blocks alternative to steps S410 to S412 between
D and E in the flowchart of Fig. 4. The contents
of the remaining processing blocks are the same as
those in the first embodiment, and a description thereof
will be omitted. Only the processes between D and E,
which are different from those in Fig. 4, will be described.
[0053] As described in the first embodiment, when the
user utters a desired character string "so" (step S601),
the speech recognition unit 208 performs speech recognition
for the utterance (step S602) and presents the
recognition result (step S603). From this presentation,
the user can determine whether the result is correct
(step S604). If the recognition result is incorrect, the flow
returns to step S601, and the user utters the
desired character string "so" again. The
processes in steps S601 to S604 are repeated until a
correct recognition result is obtained. If a correct recognition
result is obtained in step S604, the user operates
to confirm the obtained recognition result as the character
string to be used (step S605).
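
The repeat-until-correct flow of steps S601 to S605 can be sketched as a simple loop. The function `confirm_with_check` and its two parameters are hypothetical: the sequence of recognition results stands in for successive utterances and recognitions, and the callback stands in for the user's judgement in step S604.

```python
from typing import Callable, Iterable, Optional


def confirm_with_check(recognition_results: Iterable[str],
                       user_accepts: Callable[[str], bool]) -> Optional[str]:
    """Return the first recognition result the user accepts, or None.

    Each item of `recognition_results` models one pass through
    S601 (utterance) and S602 (recognition).
    """
    for result in recognition_results:
        # S603: the result would be presented to the user here.
        if user_accepts(result):      # S604: the user checks the result
            return result             # S605: the character string is confirmed
        # Otherwise the flow returns to S601 and the user utters again.
    return None


# First utterance misrecognized as "show", second recognized as "so".
confirmed = confirm_with_check(["show", "so"], lambda text: text == "so")
```

Returning `None` when the results run out is a choice made for this sketch; the flowchart as described simply keeps looping until a correct result is obtained.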

[0054] Fig. 7 shows the screen display transition ac- 
cording to this embodiment. 
[0055] Fig. 7 shows a case wherein speech recogni- 
tion for character string selection causes a recognition 
error in inputting a character string "so" of a character 
string "Thank you so much." in the same manner as in 
the first embodiment. 

[0056] Reference numeral 710 shows a state wherein
a character string "Thank you " is confirmed. When the
user inputs a character "s", character string candidates
predicted from the character "s" are presented in the
same manner as in the first embodiment (720). To use
speech recognition for character string selection, the user
presses a button ">", and speech recognition starts
(730). Reference numeral 740 denotes a display in step
S603 when an utterance "so" of the user is misrecognized
as a character string "show". In this display example,
the recognition result is presented by moving a cursor
to a character string corresponding to the recognition
result out of presented character string candidates
(the character string is underlined). A recognition result
presentation method is not limited to this. For example,
the recognition result may be presented by highlighting
the character string corresponding to the recognition result.
Since the presented recognition result "show" is not
the target one ("so"), the user utters the character string
"so" again (NO in step S604 and then step S601). Reference
numeral 750 denotes a state wherein the second
utterance is correctly recognized, and the character
string "so" is presented as the recognition result (steps
S602 and S603). Since the recognition result is correct,
the user presses a button "•" and confirms the character
string (YES in step S604 and then step S605).
When the character string is confirmed, a displayed window
where predicted character string candidates are
displayed disappears, and the window returns to a character
input window (760), as described in the first embodiment.

[0057] As described above, according to this embodiment,
if speech recognition for character string selection
causes a recognition error, the user can utter any
number of times until a correct recognition result is obtained.
This makes it possible to easily correct a recognition
error.



(Third Embodiment) 



[0058] In the above-mentioned embodiments, all predicted
character string candidates are presented or a
predetermined number of ones out of many character
string candidates are presented. The embodiments do
not take into consideration presentation when predicted
character string candidates include a plurality of character
string candidates whose pronunciations are the
same. This embodiment will describe character string
candidate presentation considering this case.
[0059] This embodiment is characterized in that character
string candidates are classified into a plurality of
groups, and the candidates are presented over a plurality
of times, if the number of the predicted character
string candidates is large or if the character string candidates
include character string candidates whose pronunciations
are the same. The processing will be described
in detail with reference to Fig. 8.
[0060] Fig. 8 is a flowchart showing the flow of a process
of presenting character string candidates according
to this embodiment. The flowchart shows only processing
blocks alternative to steps S403 and
S404 between A and B in the flowchart of Fig. 4. The
contents of the remaining processing blocks are the
same as those in the first embodiment, and a description
thereof will be omitted. Only the processes between A
and B, which are different from those in Fig. 4, will be
described. Note that this embodiment can be combined
with the processing described in the second embodiment.

[0061] In step S801, it is determined whether the
number of character string candidates predicted in step
S402 of Fig. 4 is larger than a predetermined number
N. If the number of the character string candidates is
larger than N, the process in step S803 and subsequent
processes are performed to present the character string
candidates over a plurality of times. The number N is
the number of candidates to be presented at a time. The
number N may be determined in advance. Alternatively,
the number of candidates which can be presented at a
time may be calculated from the number of characters
of the predicted character string candidates and the size
of a display area for presentation each time character
string prediction is performed.
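
One way the number of candidates presentable at a time might be calculated from the candidate lengths and the display area is sketched below. The layout model is an assumption: a fixed-width character grid of `columns` by `rows`, candidates laid out left to right with a gap of one character, wrapping to the next row when a candidate does not fit. The function name `candidates_that_fit` and its parameters are hypothetical.

```python
from typing import List


def candidates_that_fit(candidates: List[str], columns: int, rows: int,
                        gap: int = 1) -> int:
    """Count how many leading candidates fit in a columns x rows character grid."""
    if rows <= 0:
        return 0
    row, used, count = 0, 0, 0
    for word in candidates:
        width = len(word)
        if width > columns:
            break  # a string wider than a whole row ends the layout
        # Width of the current row if this word is appended to it.
        needed = width if used == 0 else used + gap + width
        if needed > columns:
            row += 1  # wrap to the next row
            if row >= rows:
                break  # no rows left in the display area
            needed = width
        used = needed
        count += 1
    return count


candidates = ["safe", "save", "say", "see", "so", "show", "step"]
n_small = candidates_that_fit(candidates, columns=12, rows=2)
n_large = candidates_that_fit(candidates, columns=40, rows=1)
```

The returned count can then serve as the number N for the current prediction; a display with proportional fonts would need pixel widths instead of character counts.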
[0062] If the number of candidates is equal to or smaller
than N, the flow advances to step S802. It is determined
in step S802 whether the candidates include
homophones. For example, if the character string candidates
include ones whose pronunciations are the
same such as character strings "stake" and "steak", a
character string cannot be selected uniquely by speech
recognition. Accordingly, a process of presenting character
string candidates over a plurality of times is performed
in step S803 and subsequent steps. A presentation
method determination unit 203 performs the determination
of a character string candidate presentation
method. If the number of character string candidates is
equal to or smaller than N, and the character string candidates
do not include ones whose pronunciations are
the same, the flow advances to steps S808 and S809.
These steps are the same as steps S403 and S404, respectively,
in Fig. 4, and a description thereof will be
omitted. The determination processes in steps S801
and S802 are performed by the presentation method determination
unit 203.
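
The two determinations in steps S801 and S802 can be sketched as a single predicate. The function `needs_grouping`, its parameters, and the pronunciation strings are hypothetical; pronunciations are assumed to be available as plain strings, one per candidate.

```python
from typing import Dict, List


def needs_grouping(candidates: List[str], pronunciations: Dict[str, str],
                   n: int) -> bool:
    """Decide whether candidates must be presented over a plurality of times."""
    # S801: more candidates than can be presented at a time.
    if len(candidates) > n:
        return True
    # S802: at least two candidates share the same pronunciation.
    spoken = [pronunciations[candidate] for candidate in candidates]
    return len(set(spoken)) < len(spoken)


# Hypothetical pronunciation strings for illustration.
pronunciations = {"stake": "s t ey k", "steak": "s t ey k", "step": "s t eh p"}
with_homophones = needs_grouping(["stake", "steak", "step"], pronunciations, n=5)
without_homophones = needs_grouping(["stake", "step"], pronunciations, n=5)
```

When the predicate is false, the candidates can be presented in one screen as in the first embodiment.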

[0063] In step S808, a candidate classification unit 
204 classifies the character string candidates into a plu- 
rality of groups. In classification, for example, the char- 
acter string candidates may be extracted by N in order 
of decreasing frequency at a time. Alternatively, the 
character string candidates may be arranged in alpha- 
betical order and be extracted by N at a time to form a 
group Note that the classification must be performed 
such that a single group does not include character 
string candidates whose pronunciations are the same. 
As another method, a classification criterion which in- 
creases the degree of acoustic separation of character 
string candidates in each group from each other is pref- 
erably employed in order to increase the precision of 
speech recognition to be performed in subsequent 
processing. 
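
One possible way to satisfy the constraint that a single group does not include candidates with the same pronunciation is a first-fit assignment over the ordered candidates. The sketch below is an assumption made for illustration, not the classification method of the embodiment; `classify`, `PRON`, and the phonetic spellings are hypothetical.

```python
def classify(candidates, pronunciations, n):
    """Split ordered candidates into groups of at most n entries such that
    no group holds two candidates with the same pronunciation."""
    groups = []  # list of candidate lists, in presentation order
    used = []    # parallel list of sets: pronunciations already in each group
    for cand in candidates:
        pron = pronunciations[cand]
        for group, seen in zip(groups, used):
            # Place the candidate in the first group with room and no homophone.
            if len(group) < n and pron not in seen:
                group.append(cand)
                seen.add(pron)
                break
        else:
            # No existing group can take it, so start a new group.
            groups.append([cand])
            used.append({pron})
    return groups


# Illustrative pronunciations; the phonetic spellings are assumptions.
PRON = {"stack": "stak", "stadium": "steidiam", "stake": "steik",
        "star": "star", "steak": "steik"}

# Alphabetical order is one of the orderings mentioned in the text.
print(classify(sorted(PRON), PRON, 8))
```

Passing the candidates in order of decreasing frequency instead of alphabetical order gives the frequency-based variant. A criterion based on the degree of acoustic separation would need a distance measure between pronunciations, which this sketch does not attempt.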

[0064] In step S804, a group to be presented to the 
user is selected. At this time, one with the highest fre- 
quency is selected in the case of classification in order 
of decreasing frequency. In the case of classification in 
alphabetical order, the first group in alphabetical order 
is similarly selected. If degree of acoustic separation is 
used as a criterion, a group with the highest degree of 
acoustic separation is selected. 
[0065] The flow advances to step S805. In step S805, 
a character string candidate presentation unit 205 
presents character string candidates of the selected 
group on a display device 106, and an alternative can- 
didate notification unit 206 notifies the user that there 
are more character string candidates other than the pre- 
sented character string candidates. A character string 
selection method presentation unit 212 presents char- 
acter string selection methods, as described in the first 
embodiment. 

[0066] The user determines in step S806 whether the
presented character string candidates include a desired
character string. If the presented character string
candidates include the desired one, the flow advances to
step S405 in Fig. 4 to perform an operation of selecting
the desired character string from the presented character
string candidates in the same manner as in the first
embodiment. On the other hand, if the presented character
string candidates do not include the desired one,
the flow advances to step S807. In step S807, the user
either selects another group or returns to step S401
to input the next character. If the user selects another
group, a candidate switching unit 207 detects the group
selection operation of the user and switches the candidates
to be presented to those of the group selected by the user.
The flow then returns to step S805 to repeat the same
procedure.

[0067] Fig. 9 shows, based on the procedure described
in this embodiment, an example of presentation
and how candidates to be presented are switched when
predicted character string candidates include ones
whose pronunciations are the same.
[0068] Fig. 9 shows an example of character string
candidate presentation when the user wants to input a
character string "I want to have steak" and inputs a
character string "st" to input a character string "steak" after
a character string "I want to have ". Assume that the
number N of character string candidates to be presented
at a time is set to 8.

[0069] Reference numeral 910 denotes how the character
string "I want to have " is confirmed. Assume that
five character string candidates "stack", "stadium",
"stake", "star", and "steak" are obtained from the character
string "st" input by the user (steps S401 and S402).
Since the number of candidates does not exceed N = 8,
the flow shifts to step S802. Since the predicted
character string candidates include two character
string candidates "stake" and "steak" whose pronunciations
are the same (step S802), the character string
candidates are classified into two groups (a group of
"stack", "stadium", "stake", and "star" and a group of
"steak") in alphabetical order such that the character
strings "stake" and "steak" belong to different groups
(step S803).

[0070] The group of "stack", "stadium", "stake", and
"star", which is the first in alphabetical order, is selected
as a group to be presented (step S804), and the selected
character string candidates are presented to the user
(step S805). At the same time, the alternative candidate
notification unit 206 notifies the user that there are more
candidates other than the presented character string
candidates (step S805). Reference numeral 920 denotes
this state. A guide "# Next Candidates" denoted
by reference numeral 921 is an example of notification
by the alternative candidate notification unit 206.
[0071] Since the desired character string "steak" is
not presented at this time, the user presses a button "#"
to display the other candidates
(step S806). The candidate switching unit 207 detects the
candidate switching operation by the user and selects
the next candidate(s) ("steak") selected by the user, i.e.,
the next group, as the group to be presented (step
S804). Reference numeral 930 denotes an example
wherein the character string "steak" is presented to the
user. Since there are character string candidates of
the first presented group other than the character string
"steak" being presented, a guide "Previous Candidates"
denoted by reference numeral 922 is displayed
in addition to the guide "# Next Candidates" denoted by
reference numeral 921. These guides indicate that there
are more character string candidates (step S805). A
process of selecting a character string from the presented
character string candidates and confirming the character
string is performed in accordance with the procedure
described in the first or second embodiment.
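
The switching of presented groups and the accompanying guides can be sketched as follows. This is a minimal illustration under stated assumptions: the class `CandidatePager` is hypothetical, the key bound to the previous-group guide (here "*") is an assumption, and the wrap-around on the next key is assumed so that the next-group guide stays available on the last group, as in the state denoted by reference numeral 930.

```python
class CandidatePager:
    """Holds a non-empty list of candidate groups and switches the one presented."""

    def __init__(self, groups, next_key="#", prev_key="*"):
        self.groups = groups
        self.index = 0            # the first group is presented first
        self.next_key = next_key
        self.prev_key = prev_key  # assumption: the key for this guide is hypothetical

    def current(self):
        """Candidates of the group currently presented."""
        return self.groups[self.index]

    def guides(self):
        """Guides notifying the user that other candidates exist."""
        shown = []
        if len(self.groups) > 1:
            shown.append(f"{self.next_key} Next Candidates")
        if self.index > 0:
            shown.append(f"{self.prev_key} Previous Candidates")
        return shown

    def press(self, key):
        """Switch groups on a key press; other keys leave the group unchanged."""
        if key == self.next_key:
            self.index = (self.index + 1) % len(self.groups)  # wraps around
        elif key == self.prev_key and self.index > 0:
            self.index -= 1
        return self.current()


pager = CandidatePager([["stack", "stadium", "stake", "star"], ["steak"]])
print(pager.current(), pager.guides())
print(pager.press("#"), pager.guides())
```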
[0072] As described above, according to this embodiment,
if predicted character string candidates include
homophones, the character string candidates are classified
into groups such that the homophones belong to
different groups, and one group is displayed at a time.
This makes it possible to uniquely define a recognition
result for each group and omit selection operation by the
user.

[0073] In this embodiment, if predicted character
string candidates include a plurality of identical character
string candidates whose pronunciations are different
from each other, such as character strings "read" (/ri:d/)
and "read" (/red/), the presentation method determination
unit 203 may select and present one of them when
presenting the predicted character string candidates.
Assume that the character string candidates include
character strings "read" (/ri:d/), "read" (/red/), and
"red" (/red/). In this case, the character strings "read" (/red/)
and "red" (/red/) have the same pronunciation (these
words are acoustically the same, and thus the degree
of acoustic separation is 0), so the character string
"read" (/ri:d/) is selected from the character strings
"read" (/red/) and "read" (/ri:d/). The character strings
"read" (/ri:d/) and "red" (/red/) are presented as character
string candidates.

[0074] With this process, selection operation by the 
user can be omitted. 
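
The selection among identical character strings with different pronunciations can be sketched as below. The rule used here (prefer a reading that no other character string shares, otherwise keep the first reading) is an assumption that reproduces the "read"/"red" example; the function name and the phonetic spellings are hypothetical.

```python
def pick_readings(entries):
    """Keep one reading per spelling.

    entries: list of (spelling, pronunciation) pairs, possibly with the same
    spelling appearing under several pronunciations.
    """
    by_spelling = {}
    for spelling, pron in entries:
        by_spelling.setdefault(spelling, []).append(pron)

    result = []
    for spelling, prons in by_spelling.items():
        # Pronunciations used by any other spelling.
        others = {p for s, ps in by_spelling.items() if s != spelling for p in ps}
        # Prefer a reading that is acoustically distinct from the other strings.
        distinct = [p for p in prons if p not in others]
        result.append((spelling, distinct[0] if distinct else prons[0]))
    return result


print(pick_readings([("read", "ri:d"), ("read", "red"), ("red", "red")]))
```

For the three entries above, the sketch keeps "read" with the reading /ri:d/ and "red" with the reading /red/, so the presented candidates no longer collide acoustically.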

(Other Embodiment) 



[0075] In the above-mentioned embodiments, only
character strings are presented in presenting predicted
character string candidates. The present invention is not
limited to this. If each character string has a pronunciation
as in, e.g., Japanese, the pronunciation of each
character string candidate may be presented together
with the character string candidate. Also, the pronunciation
of each character string candidate may be presented
only when the character string candidate includes
characters other than kana characters. Additionally, the
pronunciation of each character string candidate including
only kana characters may be presented when the
notation is different from the pronunciation. Moreover,
character string candidates and their equivalents in another
language may be presented together.



[0076] In one example, the pronunciations of all character
string candidates are presented. In another example,
only the pronunciations of character string candidates
each of which includes a character other than hiragana
and katakana characters are presented. For example,
since a character string "[illegible]" consists only
of katakana characters, the pronunciation of the character
string is not presented. There is an exception to
this rule. In another example, the pronunciations of character
string candidates each consisting of only hiragana
and katakana characters are presented when the notation
of each character string candidate is different from the
pronunciation. For example, although a character string
"キヤノン" consists of only katakana characters, the
pronunciation of the character string is presented because
the pronunciation is "/kyanon/".
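
The rule for deciding whether to present a pronunciation can be sketched with Unicode block checks. This is a minimal illustration under stated assumptions: the reading is assumed to be supplied as a katakana string, kana characters are identified by the Hiragana (U+3040 to U+309F) and Katakana (U+30A0 to U+30FF) blocks, and the function names are hypothetical.

```python
def is_kana(ch):
    """True for characters in the Hiragana or Katakana Unicode blocks."""
    return "\u3040" <= ch <= "\u30ff"


def kana_only(text):
    return all(is_kana(ch) for ch in text)


def to_katakana(text):
    """Map hiragana letters (U+3041..U+3096) to katakana by adding 0x60."""
    return "".join(
        chr(ord(ch) + 0x60) if "\u3041" <= ch <= "\u3096" else ch
        for ch in text
    )


def show_pronunciation(notation, reading_katakana):
    """Present the reading if the notation contains non-kana characters,
    or if a kana-only notation differs from its reading."""
    if not kana_only(notation):
        return True
    return to_katakana(notation) != reading_katakana


# Kana-only notation that differs from its reading (large "ヤ" vs small "ャ").
print(show_pronunciation("キヤノン", "キャノン"))
# Kana-only notation identical to its reading.
print(show_pronunciation("カメラ", "カメラ"))
```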
[0077] The above-mentioned embodiments have described
a case wherein presentation of character string
candidates, presentation of a recognition result, and a
notification that there are other candidates are displayed
on a display device 106. The present invention is not
limited to this. A speech synthesizer 211 may synthesize
speech, and a speech output device 108 may present
the synthesized speech by voice.
[0078] The above-mentioned embodiments have described
character string input as one embodiment. The
present invention is not limited to this. The present
invention can be practiced in an apparatus which inputs
data in other forms (e.g., a case wherein image data is
input with a name given to the image).
[0079] As described above, a data input apparatus
according to the present invention uses speech recognition
and prediction of data which can be input in combination
and selects desired data from predicted data
candidates by speech recognition. This allows more efficient
data input than data input using only data prediction
or speech recognition.

[0080] If predicted data candidates include a plurality
of data whose pronunciations are the same, these data
are classified into a plurality of groups and are presented
by group such that data candidates to be presented at
a time do not include data whose pronunciations are the
same. This makes it possible to uniquely select desired
data by speech recognition and increases the convenience
when speech recognition is used for data input.
[0081] The above-mentioned embodiments have described
an example wherein only displayed character
string candidates are handled as words to be recognized.
The present invention is not limited to this. A character
string which is not displayed among predicted
character string candidates may be handled as words
to be recognized.
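
The effect of the choice of words to be recognized can be illustrated with a toy stand-in for a speech recognizer. The sketch below is an assumption made for illustration and is not a speech recognition implementation: the utterance is represented by a phonetic string, matching uses string similarity from Python's standard `difflib` module, and the names and phonetic spellings are hypothetical.

```python
import difflib


def recognize(heard, vocabulary):
    """Return the candidate whose pronunciation best matches `heard`.

    vocabulary: mapping from candidate to pronunciation (the words to be
    recognized). Returns None when nothing is close enough or when several
    candidates share the best-matching pronunciation.
    """
    prons = list(vocabulary.values())
    best = difflib.get_close_matches(heard, prons, n=1, cutoff=0.6)
    if not best:
        return None
    matches = [cand for cand, pron in vocabulary.items() if pron == best[0]]
    return matches[0] if len(matches) == 1 else None  # homophones are ambiguous


# Illustrative pronunciations; the phonetic spellings are assumptions.
DISPLAYED = {"stack": "stak", "stadium": "steidiam", "stake": "steik", "star": "star"}
ALL_PREDICTED = dict(DISPLAYED, steak="steik")

print(recognize("steik", DISPLAYED))      # vocabulary limited to displayed candidates
print(recognize("steik", ALL_PREDICTED))  # vocabulary extended to undisplayed ones
```

When the vocabulary is limited to one displayed group, the utterance maps to a single candidate. Extending the vocabulary to undisplayed candidates brings the homophone back, so such an extension would need an additional rule to resolve the ambiguity.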

[0082] Note that the present invention can be applied
to an apparatus comprising a single device or to a system
constituted by a plurality of devices.

[0083] Furthermore, the invention can be implemented
by supplying a software program, which implements
the functions of the foregoing embodiments, directly or
indirectly to a system or apparatus, reading the supplied
program code with a computer of the system or apparatus,
and then executing the program code. In this
case, so long as the system or apparatus has the functions
of the program, the mode of implementation need
not rely upon a program.

[0084] Accordingly, since the functions of the present
invention are implemented by computer, the program
code installed in the computer also implements the
present invention. In other words, the claims of the
present invention also cover a computer program for the
purpose of implementing the functions of the present
invention.

[0085] In this case, so long as the system or apparatus
has the functions of the program, the program may
be executed in any form, such as an object code, a program
executed by an interpreter, or script data supplied
to an operating system.

[0086] Examples of storage media that can be used for
supplying the program are a floppy disk, a hard disk, an
optical disk, a magneto-optical disk, a CD-ROM, a
CD-R, a CD-RW, a magnetic tape, a non-volatile type
memory card, a ROM, and a DVD (DVD-ROM and a
DVD-R).

[0087] As for the method of supplying the program, a
client computer can be connected to a website on the
Internet using a browser of the client computer, and the
computer program of the present invention or an automatically-installable
compressed file of the program can
be downloaded to a recording medium such as a hard
disk. Further, the program of the present invention can
be supplied by dividing the program code constituting
the program into a plurality of files and downloading the
files from different websites. In other words, a WWW
(World Wide Web) server that downloads, to multiple users,
the program files that implement the functions of the
present invention by computer is also covered by the
claims of the present invention.
[0088] It is also possible to encrypt and store the program
of the present invention on a storage medium such
as a CD-ROM, distribute the storage medium to users,
allow users who meet certain requirements to download
decryption key information from a website via the Internet,
and allow these users to decrypt the encrypted program
by using the key information, whereby the program
is installed in the user computer.
[0089] Besides the cases where the aforementioned
functions according to the embodiments are implemented
by executing the read program by computer, an operating
system or the like running on the computer may
perform all or a part of the actual processing so that the
functions of the foregoing embodiments can be implemented
by this processing.
[0090] Furthermore, after the program read from the
storage medium is written to a function expansion board
inserted into the computer or to a memory provided in
a function expansion unit connected to the computer, a
CPU or the like mounted on the function expansion
board or function expansion unit performs all or a part
of the actual processing so that the functions of the foregoing
embodiments can be implemented by this
processing.
[0091] As many apparently widely different embodiments
of the present invention can be made without departing
from the spirit and scope thereof, it is to be understood
that the invention is not limited to the specific
embodiments thereof except as defined in the appended
claims.



Claims 

1. An information processing apparatus comprising:

prediction means for predicting at least one
character candidate which follows at least one
input character;
display control means for controlling displaying
said at least one character candidate predicted
by said prediction means;
speech recognition means for performing
speech recognition for input speech using said
at least one displayed character candidate as
a word to be recognized; and
confirmation means for confirming, as at least
one character to be used, at least one character
serving as a recognition result obtained by the
speech recognition means.

2. The apparatus according to claim 1, further comprising
control means for controlling said display control
means and speech recognition means to highlight
said at least one character serving as the recognition
result obtained by the speech recognition
means out of said at least one character candidate
displayed by said display control means in order to
make a user confirm whether the recognition result
is correct and perform speech recognition in this
state for utterance given again,

wherein said confirmation means confirms, as
said at least one character to be used, a recognition
result at a time when it is detected that a predetermined
button is pressed, under a control of said
control means.

3. The apparatus according to claim 1, further comprising
classification means for, if said at least one
character candidate predicted by said prediction
means includes homophones, classifying said at
least one character candidate into a plurality of
groups such that the homophones belong to different
groups,

wherein said display control means controls
displaying said at least one character candidate by
the group formed by said classification means.

4. The apparatus according to claim 1, wherein said
speech recognition means determines a pronunciation
of a word to be recognized on the basis of
character prediction data used by said prediction
means.

5. A data input method in an information processing 
apparatus, comprising: 

a prediction step of predicting at least one char- 
acter candidate which follows at least one char- 
acter input by a character input device; 
a display control step of controlling displaying 
said at least one character candidate predicted 
in the prediction step on a display device; 
a speech recognition step of performing speech
recognition for speech input by a speech input 
device using said at least one character candi- 
date displayed on the display device as a word 
to be recognized; and 

a confirmation step of confirming, as at least
one character to be used, at least one character
serving as a recognition result obtained in the
speech recognition step.

6. The method according to claim 5, further comprising
a control step of controlling processes in the
display control step and speech recognition step to
highlight said at least one character serving as the
recognition result obtained in the speech recognition
step out of said at least one character candidate
displayed on the display device to make a user confirm
whether the recognition result is correct and
perform speech recognition in this state for utterance
given again by the speech input device,

wherein in the confirmation step, a recognition
result at a time when it is detected that a predetermined
button of the character input device is
pressed is confirmed as said at least one character
to be used, under a control in the control step.

7. The method according to claim 5, further comprising
a classification step of, if said at least one
character candidate predicted in the prediction step
includes homophones, classifying said at least one
character candidate into a plurality of groups such
that the homophones belong to different groups,

wherein in the display control step, said at
least one character candidate is displayed on the
display device by the group formed in the classification
step.

8. The method according to claim 5, wherein in the
speech recognition step, a pronunciation of a word
to be recognized is determined on the basis of character
prediction data used in the prediction step.

9. A program executed by a computer, comprising 
codes of: 

a prediction step of predicting at least one char- 
acter candidate which follows at least one char- 
acter input by a character input device; 
a display control step of controlling displaying 
said at least one character candidate predicted 
in the prediction step on a display device; 
a speech recognition step of performing speech 
recognition for speech input by a speech input 
device using said at least one character candi- 
date displayed on the display device as a word 
to be recognized; and 

a confirmation step of confirming, as at least 
one character to be used, at least one character 
serving as a recognition result obtained in the 
speech recognition step. 

10. A computer-readable storage medium which stores 
a program defined in claim 9. 
11. An information processing apparatus comprising:

a plurality of keys for the input of symbols, 
wherein each of at least some of the keys is 
operable for the input of a plurality of different
symbols;
a predictive text generator responsive to actu- 
ation of the keys to predict symbols intended 
by the user; 

a display control means for controlling display 
of said predicted symbol; 
speech recognition means for performing 
speech recognition on an input speech signal 
using said at least one predicted symbol to gen- 
erate a recognised symbol; and 

wherein said display controller is operable to
control the display in dependence upon the recog- 
nised symbol. 

12. A portable cellular communication device compris- 
ing: 

a plurality of keys for the input of symbols, 
wherein each of at least some of the keys is 
operable for the input of a plurality of different 
symbols; 

a keyboard processor operable to generate text 
data in dependence upon the actuation of one 
or more of said keys by a user; 
an automatic speech recogniser operable to 
recognise an input speech signal and to gener- 
ate a recognition result; and 
a controller responsive to the text data generated
by said keyboard processor and responsive
to said recognition result generated by said
automatic speech recogniser to generate text.